Yuzhi Zhao

Adaptive Rollout Allocation for Online Reinforcement Learning with Verifiable Rewards

Feb 03, 2026
Via arXiv

Optimizing Agentic Reasoning with Retrieval via Synthetic Semantic Information Gain Reward

Jan 31, 2026
Via arXiv

VP-Bench: A Comprehensive Benchmark for Visual Prompting in Multimodal Large Language Models

Nov 14, 2025
Via arXiv

From Exploration to Exploitation: A Two-Stage Entropy RLVR Approach for Noise-Tolerant MLLM Training

Nov 11, 2025
Via arXiv

Better Reasoning with Less Data: Enhancing VLMs Through Unified Modality Scoring

Jun 10, 2025
Via arXiv

SEFE: Superficial and Essential Forgetting Eliminator for Multimodal Continual Instruction Tuning

May 05, 2025
Via arXiv

ICM-Assistant: Instruction-tuning Multimodal Large Language Models for Rule-based Explainable Image Content Moderation

Dec 24, 2024
Via arXiv

Modeling Dual-Exposure Quad-Bayer Patterns for Joint Denoising and Deblurring

Dec 10, 2024
Via arXiv

LLaVA-SpaceSGG: Visual Instruct Tuning for Open-vocabulary Scene Graph Generation with Enhanced Spatial Relations

Dec 09, 2024
Via arXiv

Bidirectionally Deformable Motion Modulation For Video-based Human Pose Transfer

Jul 18, 2023
Via arXiv